Combining Parallel Treebanks and Geo-Tagging

Authors

  • Martin Volk
  • Anne Göhring
  • Torsten Marek
Abstract

This paper describes a new kind of semantic annotation in parallel treebanks. We build French-German parallel treebanks of mountaineering reports, a text genre that abounds with geographical names which we classify and ground with reference to a large gazetteer of Swiss toponyms. We discuss the challenges in obtaining a high recall and precision in automatic grounding, and sketch how we represent the grounding information in our treebank.
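The grounding step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy in-memory gazetteer keyed by lower-cased name; the entry IDs, the `GazetteerEntry` type, and the `ground` function are hypothetical and are not taken from the paper. Exact lookup of this kind tends to favor precision over recall, and names with several gazetteer entries are left unresolved rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GazetteerEntry:
    entry_id: str        # hypothetical identifier, not from a real gazetteer
    name: str
    feature_class: str   # e.g. "mountain", "town"


# Toy gazetteer (assumption): lower-cased name -> all entries sharing that name.
GAZETTEER = {
    "matterhorn": [GazetteerEntry("toy-001", "Matterhorn", "mountain")],
    "zermatt": [GazetteerEntry("toy-002", "Zermatt", "town")],
    # Several distinct peaks can share one name, so this name is ambiguous.
    "schwarzhorn": [
        GazetteerEntry("toy-003", "Schwarzhorn", "mountain"),
        GazetteerEntry("toy-004", "Schwarzhorn", "mountain"),
    ],
}


def ground(tokens: list[str]) -> list[tuple[str, Optional[GazetteerEntry]]]:
    """Return (token, entry) pairs for tokens found in the gazetteer.

    Unambiguous names are grounded to their single entry; ambiguous names
    are reported with None so a later disambiguation step can handle them.
    """
    results = []
    for token in tokens:
        candidates = GAZETTEER.get(token.lower(), [])
        if len(candidates) == 1:
            results.append((token, candidates[0]))
        elif len(candidates) > 1:
            results.append((token, None))
    return results


if __name__ == "__main__":
    sentence = ["Vom", "Matterhorn", "nach", "Zermatt", "und", "Schwarzhorn"]
    for token, entry in ground(sentence):
        print(token, entry.entry_id if entry else "ambiguous")
```

In a treebank setting, the resulting entry identifier could be stored as an attribute on the node covering the name, although the representation actually used in the paper is not specified here.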


Similar resources

A non-projective greedy dependency parser with bidirectional LSTMs

The LyS-FASTPARSE team presents BIST-COVINGTON, a neural implementation of the Covington (2001) algorithm for non-projective dependency parsing. The bidirectional LSTM approach by Kiperwasser and Goldberg (2016) is used to train a greedy parser with a dynamic oracle to mitigate error propagation. The model participated in the CoNLL 2017 UD Shared Task. In spite of not using any ensemble methods...


Deletions and their reconstruction in tectogrammatical syntactic tagging of very large corpora

The procedure for reconstructing the underlying structure of sentences (in the process of tagging a very large corpus of Czech) is described, with special attention paid to the conditions under which the reconstruction of ellipted nodes is carried out. 1. The tagging scenarios with different (degrees and types of) theoretical backgrounds have undergone a rather rapid development from morpho...


Learning Translations for Tagged Words: Extending the Translation Lexicon of an ITG for Low Resource Languages

We tackle the challenge of learning part-of-speech classified translations as part of an inversion transduction grammar, by learning translations for English words with known part-of-speech tags, both from existing translation lexica and from parallel corpora. When translating from a low resource language into English, we can expect to have rich resources for English, such as treebanks, and smal...


Suggestive Geo-Tagging Assistance for Geo-Collaboration Tools

An argumentation map is an online discussion forum for spatially related topics that combines the forum with an interactive map. The utility of an argumentation mapping tool highly depends on the accuracy and quantity of the geo-tags that link the discussion contributions to geographic locations. These geo-tags can be created manually by the users of the argumentation map or automatically by a ...


Urdu and Hindi: Translation and sharing of linguistic resources

Hindi and Urdu share a common phonology, morphology and grammar but are written in different scripts. In addition, the vocabularies have diverged significantly, especially in the written form. In this paper we show that we can get reasonable quality translations (we estimated the Translation Error Rate at 18%) between the two languages even in the absence of a parallel corpus. Linguistic resour...




Publication date: 2010